Multiagent Low-Dimensional Linear Bandits
Abstract
We study a multiagent stochastic linear bandit with side information, parameterized by an unknown vector $\theta^* \in \mathbb{R}^{d}$. The side information consists of a finite collection of low-dimensional subspaces, one of which contains $\theta^*$. In our setting, agents can collaborate to reduce regret by sending recommendations across a communication graph connecting them. We present a novel decentralized algorithm, where agents communicate subspace indices with each other and each agent plays a projected variant of LinUCB on the corresponding (low-dimensional) subspace. By distributing the search for the optimal subspace across users and learning the unknown vector in the corresponding low-dimensional subspace, we show that the per-agent finite-time regret is much smaller than in the case when agents do not communicate. We finally complement these results through simulations.
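The idea of playing a projected variant of LinUCB on a low-dimensional subspace can be illustrated with a minimal single-agent sketch. This is not the paper's algorithm: it leaves out the communication of subspace indices between agents, it assumes the subspace containing $\theta^*$ is already known and given by an orthonormal basis, and the class name `ProjectedLinUCB`, the fixed confidence width `beta`, the regularizer `reg`, and the finite action set are all hypothetical choices made for illustration.

```python
import numpy as np


class ProjectedLinUCB:
    """LinUCB run in the coordinates of an m-dimensional subspace of R^d.

    Illustrative sketch only; names and constants are assumptions.
    """

    def __init__(self, basis, reg=1.0, beta=1.0):
        # basis: (d, m) matrix with orthonormal columns spanning the subspace.
        self.basis = basis
        m = basis.shape[1]
        self.V = reg * np.eye(m)  # regularized Gram matrix of projected actions
        self.b = np.zeros(m)      # reward-weighted sum of projected actions
        self.beta = beta          # hypothetical fixed confidence width

    def select(self, actions):
        # actions: (K, d) array of candidate action vectors.
        Z = actions @ self.basis  # (K, m) coordinates in the subspace
        theta_hat = np.linalg.solve(self.V, self.b)  # ridge estimate in R^m
        V_inv = np.linalg.inv(self.V)
        # Exploration bonus sqrt(z^T V^{-1} z) for every candidate action.
        bonus = np.sqrt(np.einsum("ki,ij,kj->k", Z, V_inv, Z))
        return int(np.argmax(Z @ theta_hat + self.beta * bonus))

    def update(self, action, reward):
        # Only the projected action enters the statistics.
        z = self.basis.T @ action
        self.V += np.outer(z, z)
        self.b += reward * z


def run_demo(seed=0, d=8, m=2, n_actions=20, horizon=500):
    """Simulate one agent on a synthetic instance (all values are made up)."""
    rng = np.random.default_rng(seed)
    basis, _ = np.linalg.qr(rng.normal(size=(d, m)))  # orthonormal (d, m) basis
    theta_star = basis @ rng.normal(size=m)           # lies in the subspace
    actions = rng.normal(size=(n_actions, d))
    actions /= np.linalg.norm(actions, axis=1, keepdims=True)
    agent = ProjectedLinUCB(basis)
    for _ in range(horizon):
        k = agent.select(actions)
        reward = actions[k] @ theta_star + 0.1 * rng.normal()
        agent.update(actions[k], reward)
    # Map the m-dimensional estimate back to R^d.
    theta_hat = basis @ np.linalg.solve(agent.V, agent.b)
    return theta_star, theta_hat, basis
```

Because the estimate and the confidence set live in $\mathbb{R}^m$ rather than $\mathbb{R}^d$, the agent only has to learn $m$ coordinates, which is the intuition behind working in the low-dimensional subspace.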
Similar resources
Conservative Contextual Linear Bandits
Safety is a desirable property that can immensely increase the applicability of learning algorithms in real-world decision-making problems. It is much easier for a company to deploy an algorithm that is safe, i.e., guaranteed to perform at least as well as a baseline. In this paper, we study the issue of safety in contextual linear bandits that have application in many different fields includin...
Misspecified Linear Bandits
We consider the problem of online learning in misspecified linear stochastic multi-armed bandit problems. Regret guarantees for state-of-the-art linear bandit algorithms such as Optimism in the Face of Uncertainty Linear bandit (OFUL) hold under the assumption that the arms expected rewards are perfectly linear in their features. It is, however, of interest to investigate the impact of potentia...
Structured Stochastic Linear Bandits
The stochastic linear bandit problem proceeds in rounds where at each round the algorithm selects a vector from a decision set after which it receives a noisy linear loss parameterized by an unknown vector. The goal in such a problem is to minimize the (pseudo) regret which is the difference between the total expected loss of the algorithm and the total expected loss of the best fixed vector in...
Stochastic Low-Rank Bandits
Many problems in computer vision and recommender systems involve low-rank matrices. In this work, we study the problem of finding the maximum entry of a stochastic low-rank matrix from sequential observations. At each step, a learning agent chooses pairs of row and column arms, and receives the noisy product of their latent values as a reward. The main challenge is that the latent values are un...
Structured Stochastic Linear Bandits (DRAFT)
In this paper, we consider the structured stochastic linear bandit problem which is a sequential decision making problem where at each round $t$ the algorithm has to select a $p$-dimensional vector $x_t$ from a convex set after which it observes a loss $\ell_t(x_t)$. We assume the loss is a linear function of the vector and an unknown parameter $\theta^*$. We consider the problem when $\theta^*$ is structured which we chara...
Journal
Journal title: IEEE Transactions on Automatic Control
Year: 2023
ISSN: 0018-9286, 1558-2523, 2334-3303
DOI: https://doi.org/10.1109/tac.2022.3179521